-
Notifications
You must be signed in to change notification settings - Fork 54
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
test(robot): migrate test_replica_rebuild_per_volume_limit #2159
test(robot): migrate test_replica_rebuild_per_volume_limit #2159
Conversation
Caution Review failedThe pull request is closed. WalkthroughThe changes introduce new functionalities for managing volume replicas in the system. Two new keywords are added for deleting a specified number of replicas and enforcing a constraint on replica rebuilding. Additionally, several methods in the Changes
Assessment against linked issues
Possibly related PRs
Suggested reviewers
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 5
🧹 Outside diff range and nitpick comments (14)
e2e/tests/regression/test_replica.robot (4)
1-15
: Consider enhancing the documentation.While the basic documentation is present, it would be beneficial to add more details about:
- The purpose of these replica test cases
- Prerequisites for running these tests
- Expected test environment requirements
-Documentation Replica Test Cases +Documentation Test cases validating Longhorn volume replica behavior, +... including replica rebuilding limits and scheduling. +... Prerequisites: Longhorn cluster with sufficient nodes +... for replica distribution.
16-20
: Add documentation for test variables.The variables lack documentation explaining their purpose and chosen values. This information is crucial for maintainability.
*** Variables *** +# Number of test iterations ${LOOP_COUNT} 1 +# Maximum number of retry attempts for operations ${RETRY_COUNT} 300 +# Time interval between retries in seconds ${RETRY_INTERVAL} 1 +# Data engine version to use for volume creation ${DATA_ENGINE} v1
35-49
: Add soft anti-affinity validation.While soft anti-affinity is enabled, the test doesn't verify that replicas are actually scheduled on different nodes.
Add a verification step after replica creation:
And Wait for volume 0 healthy +And Verify volume 0 replicas are on different nodes And Write data to volume 0
23-25
: Consider adding error scenarios.While the happy path is well tested, consider adding test cases for:
- Node failure during replica rebuilding
- Network partition scenarios
- Disk full conditions
Would you like me to help create additional test cases for these scenarios?
e2e/libs/volume/volume.py (2)
107-108
: Add docstring to clarify node_name parameter usage.Consider adding a docstring to explain the behavior when node_name is provided vs. when it's None. This will help other developers understand how to use this method effectively.
def wait_for_replica_rebuilding_start(self, volume_name, node_name=None): + """Wait for replica rebuilding to start. + + Args: + volume_name: Name of the volume + node_name: Optional. If provided, wait for rebuilding on specific node. + If None, wait for any replica rebuilding. + """ return self.volume.wait_for_replica_rebuilding_start(volume_name, node_name)
110-111
: Add docstring to clarify node_name parameter usage.Similar to the previous method, add a docstring to explain the behavior with and without node_name parameter.
def is_replica_rebuilding_in_progress(self, volume_name, node_name=None): + """Check if replica rebuilding is in progress. + + Args: + volume_name: Name of the volume + node_name: Optional. If provided, check rebuilding on specific node. + If None, check for any replica rebuilding. + Returns: + bool: True if rebuilding is in progress, False otherwise. + """ return self.volume.is_replica_rebuilding_in_progress(volume_name, node_name)e2e/keywords/volume.resource (2)
102-104
: Add documentation for the new keyword.The implementation looks good and follows the established patterns. Consider adding documentation to describe the purpose and behavior of this keyword, similar to other keywords in the file.
+Documentation Delete specified number of replicas from a volume Delete ${count} replicas of volume ${volume_id} ${volume_name} = generate_name_with_suffix volume ${volume_id} delete_replicas ${volume_name} ${count}
199-201
: Add documentation for the new keyword.The implementation looks good and follows the established patterns. Consider adding documentation to describe the purpose and behavior of this keyword, similar to other keywords in the file.
+Documentation Verify that only one replica rebuilding process occurs at a time for the specified volume Only one replica rebuilding will start at a time for volume ${volume_id} ${volume_name} = generate_name_with_suffix volume ${volume_id} only_one_replica_rebuilding_will_start_at_a_time ${volume_name}
e2e/libs/keywords/volume_keywords.py (1)
209-223
: Improve clarity and add timeout handling.The implementation could be enhanced for better clarity and robustness:
Consider these improvements:
async def only_one_replica_rebuilding_will_start_at_a_time(self, volume_name): + timeout = 300 # Add appropriate timeout value - async def wait_for_replica_rebuilding(): - tasks = [ - asyncio.create_task(self.volume.wait_for_replica_rebuilding_start(volume_name), name=volume_name), - ] - - done, pending = await asyncio.wait(tasks, return_when=asyncio.ALL_COMPLETED) - logging(f"Observed {done.pop().get_name()} started replica rebuilding") - - await wait_for_replica_rebuilding() + try: + logging(f"Waiting for replica rebuilding to start for volume {volume_name}") + await asyncio.wait_for( + self.volume.wait_for_replica_rebuilding_start(volume_name), + timeout=timeout + ) + logging(f"Replica rebuilding started for volume {volume_name}") - assert self.volume.is_replica_rebuilding_in_progress(volume_name), \ - f"Expect {volume_name} replica rebuilding in progress" + assert self.volume.is_replica_rebuilding_in_progress(volume_name), \ + f"Expected volume {volume_name} to have replica rebuilding in progress, but no rebuilding was detected" + except asyncio.TimeoutError: + raise TimeoutError(f"Timeout waiting for replica rebuilding to start for volume {volume_name} after {timeout} seconds")e2e/libs/volume/rest.py (5)
140-140
: Useis not None
forNone
comparisonIn Python, it's recommended to use
is not None
when comparing toNone
for clarity and adherence to PEP 8 guidelines.Apply this diff to correct the comparison:
- assert rebuilding_replica_name != None, f"Waiting for replica rebuilding start for volume {volume_name} on node {node_name} failed: replicas = {v.replicas}" + assert rebuilding_replica_name is not None, f"Waiting for replica rebuilding start for volume {volume_name} on node {node_name} failed: replicas = {v.replicas}"🧰 Tools
🪛 Ruff
140-140: Comparison to
None
should becond is not None
Replace with
cond is not None
(E711)
144-144
: Replace unused loop variablei
with underscore_
The loop variable
i
is not used within the loop body. Using_
indicates that the variable is intentionally unused.Apply this diff to improve readability:
- for i in range(self.retry_count): + for _ in range(self.retry_count):🧰 Tools
🪛 Ruff
144-144: Loop control variable
i
not used within loop body(B007)
183-183
: Simplify conditional assignment ofnode_name
You can enhance readability by rearranging the conditional expression.
Apply this diff to simplify the assignment:
- node_name = replica.hostId if not node_name else node_name + node_name = node_name if node_name else replica.hostIdAlternatively, you can use the more concise:
+ node_name = node_name or replica.hostId
🧰 Tools
🪛 Ruff
183-183: Use
node_name if node_name else replica.hostId
instead ofreplica.hostId if not node_name else node_name
Replace with
node_name if node_name else replica.hostId
(SIM212)
237-237
: Simplify conditional expression in logging statementFor improved readability, rearrange the conditional expression in the log message.
Apply this diff:
- logging(f"wait for {volume_name} replica rebuilding completed on {'all nodes' if not node_name else node_name} ... ({i})") + logging(f"wait for {volume_name} replica rebuilding completed on {node_name if node_name else 'all nodes'} ... ({i})")🧰 Tools
🪛 Ruff
237-237: Use
node_name if node_name else 'all nodes'
instead of'all nodes' if not node_name else node_name
Replace with
node_name if node_name else 'all nodes'
(SIM212)
274-275
: Consistent conditional expressions in assertions and loggingTo maintain consistency and enhance readability, adjust the conditional expressions in both the logging and assertion statements.
Apply this diff:
- logging(f"Completed volume {volume_name} replica rebuilding on {'all nodes' if not node_name else node_name}") - assert completed, f"Expect volume {volume_name} replica rebuilding completed on {'all nodes' if not node_name else node_name}" + logging(f"Completed volume {volume_name} replica rebuilding on {node_name if node_name else 'all nodes'}") + assert completed, f"Expect volume {volume_name} replica rebuilding completed on {node_name if node_name else 'all nodes'}"🧰 Tools
🪛 Ruff
274-274: Use
node_name if node_name else 'all nodes'
instead of'all nodes' if not node_name else node_name
Replace with
node_name if node_name else 'all nodes'
(SIM212)
275-275: Use
node_name if node_name else 'all nodes'
instead of'all nodes' if not node_name else node_name
Replace with
node_name if node_name else 'all nodes'
(SIM212)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (6)
e2e/keywords/volume.resource
(2 hunks)e2e/libs/keywords/volume_keywords.py
(3 hunks)e2e/libs/volume/crd.py
(2 hunks)e2e/libs/volume/rest.py
(4 hunks)e2e/libs/volume/volume.py
(2 hunks)e2e/tests/regression/test_replica.robot
(1 hunks)
🧰 Additional context used
🪛 Ruff
e2e/libs/volume/rest.py
140-140: Comparison to None
should be cond is not None
Replace with cond is not None
(E711)
144-144: Loop control variable i
not used within loop body
(B007)
183-183: Use node_name if node_name else replica.hostId
instead of replica.hostId if not node_name else node_name
Replace with node_name if node_name else replica.hostId
(SIM212)
237-237: Use node_name if node_name else 'all nodes'
instead of 'all nodes' if not node_name else node_name
Replace with node_name if node_name else 'all nodes'
(SIM212)
274-274: Use node_name if node_name else 'all nodes'
instead of 'all nodes' if not node_name else node_name
Replace with node_name if node_name else 'all nodes'
(SIM212)
275-275: Use node_name if node_name else 'all nodes'
instead of 'all nodes' if not node_name else node_name
Replace with node_name if node_name else 'all nodes'
(SIM212)
🔇 Additional comments (4)
e2e/libs/volume/volume.py (1)
104-106
: LGTM: Well-structured addition of replica deletion by name.
The new method enhances the API by providing more granular control over replica deletion, which is beneficial for testing scenarios requiring precise replica management.
e2e/libs/keywords/volume_keywords.py (1)
162-162
: LGTM! Code simplification improves maintainability.
The simplified implementation correctly delegates the wait operation to the volume class.
e2e/libs/volume/crd.py (1)
452-452
: LGTM! Good flexibility improvement.
Making the node_name
parameter optional improves the method's flexibility while maintaining backward compatibility.
e2e/libs/volume/rest.py (1)
149-156
: Good practice: Checking for concurrent replica rebuilding
The addition of checks to ensure that only one replica is rebuilding at a time enhances data consistency and prevents potential conflicts.
... 1. Set soft anti-affinity to `true`. | ||
... 2. Create a volume with 1 replica. | ||
... 3. Attach the volume and write a few hundreds MB data to it. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Documentation inconsistency with implementation.
The documentation states "Create a volume with 1 replica" but the implementation creates a volume with 3 replicas.
Either update the documentation or adjust the implementation to match:
- ... 2. Create a volume with 1 replica.
+ ... 2. Create a volume with 3 replicas.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
... 1. Set soft anti-affinity to `true`. | |
... 2. Create a volume with 1 replica. | |
... 3. Attach the volume and write a few hundreds MB data to it. | |
... 1. Set soft anti-affinity to `true`. | |
... 2. Create a volume with 3 replicas. | |
... 3. Attach the volume and write a few hundreds MB data to it. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree with this comment for making test description and test code more consistent.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@chriscchien, glad to hear that you agree. Updating the documentation step to match the implementation will ensure consistency. Here's the updated step:
... 2. Create a volume with 3 replicas.
Let me know if you need any further assistance.
And Write data to volume 0 | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Add validation for data write operation.
The test writes data but doesn't verify the write operation's success before proceeding with replica operations.
And Write data to volume 0
+And Verify data write succeeded on volume 0
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
And Write data to volume 0 | |
And Write data to volume 0 | |
And Verify data write succeeded on volume 0 |
def wait_for_replica_rebuilding_complete(self, volume_name, node_name=None): | ||
return self.volume.wait_for_replica_rebuilding_complete(volume_name, node_name) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Codebase verification
Inconsistency found in node_name parameter handling
The verification reveals inconsistencies in the node_name parameter handling across implementations:
base.py
: All methods declarenode_name
as required parameterrest.py
: All methods correctly implementnode_name=None
as optionalcrd.py
: Onlywait_for_replica_rebuilding_complete
hasnode_name=None
, while other rebuilding methods have requirednode_name
volume.py
: All methods correctly implementnode_name=None
as optional
The base class and CRD implementation need to be updated to maintain consistency with the optional node_name parameter pattern.
🔗 Analysis chain
Add docstring and verify implementation consistency.
Add a docstring to explain the behavior with and without node_name parameter.
def wait_for_replica_rebuilding_complete(self, volume_name, node_name=None):
+ """Wait for replica rebuilding to complete.
+
+ Args:
+ volume_name: Name of the volume
+ node_name: Optional. If provided, wait for rebuilding completion on specific node.
+ If None, wait for any replica rebuilding completion.
+ """
return self.volume.wait_for_replica_rebuilding_complete(volume_name, node_name)
Let's verify that the underlying implementations in CRD and Rest classes handle the optional node_name consistently:
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Check if CRD and Rest implementations handle optional node_name consistently
# Test: Search for the implementation of these methods in CRD and Rest classes
echo "Checking implementations in CRD and Rest classes..."
rg -A 10 "def (wait_for_replica_rebuilding_start|is_replica_rebuilding_in_progress|wait_for_replica_rebuilding_complete)" "e2e/libs/volume/"
Length of output: 7703
def delete_replicas(self, volume_name, count): | ||
replica_list = self.replica.get(volume_name, node_name="") | ||
replica_names = [replica['metadata']['name'] for replica in replica_list] | ||
for i in range(int(count)): | ||
logging(f"Deleting volume {volume_name} replica volume {replica_names[i]}") | ||
self.volume.delete_replica_by_name(volume_name, replica_names[i]) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Add input validation and error handling.
The method needs additional safeguards:
- Validate that count is a positive integer and doesn't exceed available replicas
- Add error handling for the delete operation
Consider this implementation:
def delete_replicas(self, volume_name, count):
+ try:
+ count = int(count)
+ if count <= 0:
+ raise ValueError("Count must be positive")
+
replica_list = self.replica.get(volume_name, node_name="")
replica_names = [replica['metadata']['name'] for replica in replica_list]
+ if count > len(replica_names):
+ raise ValueError(f"Cannot delete {count} replicas, only {len(replica_names)} available")
+
for i in range(int(count)):
logging(f"Deleting volume {volume_name} replica volume {replica_names[i]}")
- self.volume.delete_replica_by_name(volume_name, replica_names[i])
+ try:
+ self.volume.delete_replica_by_name(volume_name, replica_names[i])
+ except Exception as e:
+ logging(f"Failed to delete replica {replica_names[i]}: {str(e)}")
+ raise
+ except ValueError as e:
+ logging(f"Invalid input: {str(e)}")
+ raise
+ except Exception as e:
+ logging(f"Failed to delete replicas: {str(e)}")
+ raise
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
def delete_replicas(self, volume_name, count): | |
replica_list = self.replica.get(volume_name, node_name="") | |
replica_names = [replica['metadata']['name'] for replica in replica_list] | |
for i in range(int(count)): | |
logging(f"Deleting volume {volume_name} replica volume {replica_names[i]}") | |
self.volume.delete_replica_by_name(volume_name, replica_names[i]) | |
def delete_replicas(self, volume_name, count): | |
try: | |
count = int(count) | |
if count <= 0: | |
raise ValueError("Count must be positive") | |
replica_list = self.replica.get(volume_name, node_name="") | |
replica_names = [replica['metadata']['name'] for replica in replica_list] | |
if count > len(replica_names): | |
raise ValueError(f"Cannot delete {count} replicas, only {len(replica_names)} available") | |
for i in range(int(count)): | |
logging(f"Deleting volume {volume_name} replica volume {replica_names[i]}") | |
try: | |
self.volume.delete_replica_by_name(volume_name, replica_names[i]) | |
except Exception as e: | |
logging(f"Failed to delete replica {replica_names[i]}: {str(e)}") | |
raise | |
except ValueError as e: | |
logging(f"Invalid input: {str(e)}") | |
raise | |
except Exception as e: | |
logging(f"Failed to delete replicas: {str(e)}") | |
raise |
def delete_replica_by_name(self, volume_name, replica_name): | ||
replica = self.obj_api.get_namespaced_custom_object( | ||
group="longhorn.io", | ||
version="v1beta2", | ||
namespace="longhorn-system", | ||
plural="replicas", | ||
name=replica_name | ||
) | ||
logging(f"Deleting replica {replica['metadata']['name']}") | ||
self.obj_api.delete_namespaced_custom_object( | ||
group="longhorn.io", | ||
version="v1beta2", | ||
namespace="longhorn-system", | ||
plural="replicas", | ||
name=replica['metadata']['name'] | ||
) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Add error handling and optimize replica deletion.
The implementation could be improved in the following ways:
- Add error handling for non-existent replicas
- Remove redundant metadata access since we already have the replica name
Consider this implementation:
def delete_replica_by_name(self, volume_name, replica_name):
- replica = self.obj_api.get_namespaced_custom_object(
- group="longhorn.io",
- version="v1beta2",
- namespace="longhorn-system",
- plural="replicas",
- name=replica_name
- )
- logging(f"Deleting replica {replica['metadata']['name']}")
- self.obj_api.delete_namespaced_custom_object(
- group="longhorn.io",
- version="v1beta2",
- namespace="longhorn-system",
- plural="replicas",
- name=replica['metadata']['name']
- )
+ try:
+ logging(f"Deleting replica {replica_name}")
+ self.obj_api.delete_namespaced_custom_object(
+ group="longhorn.io",
+ version="v1beta2",
+ namespace="longhorn-system",
+ plural="replicas",
+ name=replica_name
+ )
+ except ApiException as e:
+ if e.status == 404:
+ logging(f"Replica {replica_name} not found")
+ else:
+ raise e
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
def delete_replica_by_name(self, volume_name, replica_name): | |
replica = self.obj_api.get_namespaced_custom_object( | |
group="longhorn.io", | |
version="v1beta2", | |
namespace="longhorn-system", | |
plural="replicas", | |
name=replica_name | |
) | |
logging(f"Deleting replica {replica['metadata']['name']}") | |
self.obj_api.delete_namespaced_custom_object( | |
group="longhorn.io", | |
version="v1beta2", | |
namespace="longhorn-system", | |
plural="replicas", | |
name=replica['metadata']['name'] | |
) | |
def delete_replica_by_name(self, volume_name, replica_name): | |
try: | |
logging(f"Deleting replica {replica_name}") | |
self.obj_api.delete_namespaced_custom_object( | |
group="longhorn.io", | |
version="v1beta2", | |
namespace="longhorn-system", | |
plural="replicas", | |
name=replica_name | |
) | |
except ApiException as e: | |
if e.status == 404: | |
logging(f"Replica {replica_name} not found") | |
else: | |
raise e |
|
||
When Update volume 0 replica count to 5 | ||
Then Only one replica rebuilding will start at a time for volume 0 | ||
And Wait until volume 0 replicas rebuilding completed |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible to change the keyword name Wait until volume 0 replicas rebuilding is completed
to emphasize that it ensures only one replica in the rebuild (WO) state until the rebuild is completed, for better match test case name Test Replica Rebuilding Per Volume Limit
? Beside this, no further questions. Thank you.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@chriscchien Could you suggest an appropriate name?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about Only one replica for volume 0 rebuilds at a time until the rebuild is completed
for below two?
Then Only one replica rebuilding will start at a time for volume 0
And Wait until volume 0 replicas rebuilding completed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@chriscchien How about adding another step Monitor only one replica rebuilding will start at a time for volume 0
:
Then Only one replica rebuilding will start at a time for volume 0
And Monitor only one replica rebuilding will start at a time for volume 0
And Wait until volume 0 replicas rebuilding completed
This aligns with the description in the test skeleton.
Signed-off-by: Yang Chiu <[email protected]>
77fab67
to
cf7e52d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Outside diff range and nitpick comments (7)
e2e/tests/regression/test_replica.robot (2)
16-21
: Add documentation for test variables.Consider adding comments to explain the significance of these values, particularly:
- Why
LOOP_COUNT
is set to 1- The reasoning behind 300 retries with 1-second intervals
- What
DATA_ENGINE=v1
represents*** Variables *** +# Number of test iterations ${LOOP_COUNT} 1 +# Maximum attempts to wait for volume operations (5 minutes total) ${RETRY_COUNT} 300 +# Interval between retry attempts in seconds ${RETRY_INTERVAL} 1 +# Storage engine version to use ${DATA_ENGINE} v1
23-34
: Documentation needs additional clarity.While the test steps are well-documented, consider adding:
- Expected outcomes for each step
- Explanation of why 5 replicas are chosen in step 4
- Clarification on the data size mentioned in step 3 ("few hundreds MB")
e2e/keywords/volume.resource (2)
102-104
: Add documentation for the new keyword.The implementation looks good, but adding documentation would improve maintainability.
Add documentation to describe the keyword's purpose and parameters:
+[Documentation] Deletes the specified number of replicas from a volume +... ${count} - Number of replicas to delete +... ${volume_id} - ID of the volume Delete ${count} replicas of volume ${volume_id} ${volume_name} = generate_name_with_suffix volume ${volume_id} delete_replicas ${volume_name} ${count}
203-206
: Add documentation for the replica rebuilding constraint keyword.The implementation correctly validates the "one at a time" rebuilding constraint.
Add documentation to clarify the keyword's purpose:
+[Documentation] Validates that only one replica rebuilding process occurs at a time for the specified volume +... ${volume_id} - ID of the volume to monitor Only one replica rebuilding will start at a time for volume ${volume_id} ${volume_name} = generate_name_with_suffix volume ${volume_id} only_one_replica_rebuilding_will_start_at_a_time ${volume_name}e2e/libs/volume/rest.py (3)
140-140
: Fix comparison with None.Use
is not None
instead of!= None
for Python identity comparison.- assert rebuilding_replica_name != None, f"Waiting for replica rebuilding start for volume {volume_name} on node {node_name} failed: replicas = {v.replicas}" + assert rebuilding_replica_name is not None, f"Waiting for replica rebuilding start for volume {volume_name} on node {node_name} failed: replicas = {v.replicas}"🧰 Tools
🪛 Ruff
140-140: Comparison to
None
should becond is not None
Replace with
cond is not None
(E711)
183-183
: Simplify node_name assignment.The current expression can be simplified for better readability.
- node_name = replica.hostId if not node_name else node_name + node_name = node_name if node_name else replica.hostId🧰 Tools
🪛 Ruff
183-183: Use
node_name if node_name else replica.hostId
instead ofreplica.hostId if not node_name else node_name
Replace with
node_name if node_name else replica.hostId
(SIM212)
237-237
: Simplify string interpolation in log messages.Multiple instances of string interpolation can be simplified for better readability.
- logging(f"wait for {volume_name} replica rebuilding completed on {'all nodes' if not node_name else node_name} ... ({i})") + logging(f"wait for {volume_name} replica rebuilding completed on {node_name if node_name else 'all nodes'} ... ({i})") - logging(f"Completed volume {volume_name} replica rebuilding on {'all nodes' if not node_name else node_name}") + logging(f"Completed volume {volume_name} replica rebuilding on {node_name if node_name else 'all nodes'}") - assert completed, f"Expect volume {volume_name} replica rebuilding completed on {'all nodes' if not node_name else node_name}" + assert completed, f"Expect volume {volume_name} replica rebuilding completed on {node_name if node_name else 'all nodes'}"Also applies to: 274-274, 275-275
🧰 Tools
🪛 Ruff
237-237: Use
node_name if node_name else 'all nodes'
instead of'all nodes' if not node_name else node_name
Replace with
node_name if node_name else 'all nodes'
(SIM212)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (6)
e2e/keywords/volume.resource
(3 hunks)e2e/libs/keywords/volume_keywords.py
(3 hunks)e2e/libs/volume/crd.py
(2 hunks)e2e/libs/volume/rest.py
(4 hunks)e2e/libs/volume/volume.py
(2 hunks)e2e/tests/regression/test_replica.robot
(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
- e2e/libs/keywords/volume_keywords.py
- e2e/libs/volume/crd.py
- e2e/libs/volume/volume.py
🧰 Additional context used
🪛 Ruff
e2e/libs/volume/rest.py
140-140: Comparison to None
should be cond is not None
Replace with cond is not None
(E711)
144-144: Loop control variable i
not used within loop body
(B007)
183-183: Use node_name if node_name else replica.hostId
instead of replica.hostId if not node_name else node_name
Replace with node_name if node_name else replica.hostId
(SIM212)
237-237: Use node_name if node_name else 'all nodes'
instead of 'all nodes' if not node_name else node_name
Replace with node_name if node_name else 'all nodes'
(SIM212)
274-274: Use node_name if node_name else 'all nodes'
instead of 'all nodes' if not node_name else node_name
Replace with node_name if node_name else 'all nodes'
(SIM212)
275-275: Use node_name if node_name else 'all nodes'
instead of 'all nodes' if not node_name else node_name
Replace with node_name if node_name else 'all nodes'
(SIM212)
🔇 Additional comments (5)
e2e/tests/regression/test_replica.robot (3)
1-15
: LGTM: Well-structured test setup with comprehensive resources.
The test framework setup includes all necessary resources and proper test lifecycle hooks.
35-51
: Enhance test robustness with additional validations.
- Consider adding volume state validation after each major operation:
When Update volume 0 replica count to 5
+And Verify volume 0 is attached
Then Only one replica rebuilding will start at a time for volume 0
- As suggested in the previous review, consider renaming the keywords to better reflect their purpose:
-Then Only one replica rebuilding will start at a time for volume 0
-And Wait until volume 0 replicas rebuilding completed
+Then Only one replica for volume 0 rebuilds at a time until the rebuild is completed
+And Wait for volume 0 replica rebuild completion
1-51
: Verify complete migration from pytest.
Let's ensure all functionality from the original pytest test has been properly migrated.
✅ Verification successful
Migration verification completed - test functionality matches
The Robot Framework test implementation successfully covers all the key functionality from the original pytest version:
-
Both tests verify the same core scenario of one-replica-at-a-time rebuilding:
- Setting soft anti-affinity
- Creating volume with initial replicas
- Writing test data
- Scaling to 5 replicas
- Monitoring rebuild states
- Deleting 4 replicas
- Re-verifying rebuild behavior
- Data integrity check
-
The Robot test actually improves the verification by adding explicit steps to:
- Monitor the single-replica rebuilding constraint
- Wait for rebuild completion
- Verify volume health
The migration has maintained full test coverage while making the test steps more explicit and readable in Robot Framework's format.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Compare pytest and robot test implementations
# to ensure complete migration
# Find the original pytest file
echo "Looking for original pytest file..."
rg -l "test_replica_rebuild_per_volume_limit" --type py
# Search for any remaining pytest references to this test
echo "Checking for remaining pytest references..."
rg "test_replica_rebuild_per_volume_limit" --type py -C 5
Length of output: 1172
Script:
#!/bin/bash
# Get the full implementation of the pytest test for comparison
echo "Getting pytest implementation..."
rg -A 50 "def test_replica_rebuild_per_volume_limit" manager/integration/tests/test_scheduling.py
# Check for any helper functions used in the pytest implementation
echo "Checking for helper functions..."
rg "def (check_volume_data|cleanup_volume|wait_for_rebuild_complete)" manager/integration/tests/test_scheduling.py -A 5
Length of output: 2396
e2e/libs/volume/rest.py (2)
Line range hint 121-171
: LGTM! Good addition of concurrent rebuilding check.
The changes to make node_name
optional and the addition of monitoring for concurrent rebuilding (lines 149-155) improve the robustness of the test framework.
🧰 Tools
🪛 Ruff
140-140: Comparison to None
should be cond is not None
Replace with cond is not None
(E711)
144-144: Loop control variable i
not used within loop body
(B007)
234-270
: LGTM! Well-documented rebuild status handling.
The changes effectively handle both node-specific and all-node rebuilding scenarios. The comments about rebuildStatus
reliability (lines 251-258) provide valuable context for future maintainers.
🧰 Tools
🪛 Ruff
237-237: Use node_name if node_name else 'all nodes'
instead of 'all nodes' if not node_name else node_name
Replace with node_name if node_name else 'all nodes'
(SIM212)
Monitor only one replica rebuilding will start at a time for volume ${volume_id} | ||
${volume_name} = generate_name_with_suffix volume ${volume_id} | ||
wait_for_replica_rebuilding_to_complete ${volume_name} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Keyword implementation may not match its intended purpose.
The keyword name suggests monitoring that only one replica rebuilds at a time, but the implementation only waits for rebuilding completion. This might not effectively validate the "one at a time" constraint.
Consider implementing proper monitoring logic:
Monitor only one replica rebuilding will start at a time for volume ${volume_id}
${volume_name} = generate_name_with_suffix volume ${volume_id}
- wait_for_replica_rebuilding_to_complete ${volume_name}
+ only_one_replica_rebuilding_will_start_at_a_time ${volume_name}
Committable suggestion skipped: line range outside the PR's diff.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Which issue(s) this PR fixes:
Issue longhorn/longhorn#9774
What this PR does / why we need it:
migrate test_replica_rebuild_per_volume_limit
Special notes for your reviewer:
Additional documentation or context
Summary by CodeRabbit
Summary by CodeRabbit
New Features
Bug Fixes
Tests